A closed-form approximation for the median of 
the beta distribution 

Jouni Kerman 
November 1, 2011 

Abstract 

A simple closed-form approximation for the median of the beta distribution 
Beta(a, b) is introduced: (a - 1/3)/(a + b - 2/3) for (a, b) both larger than 1 has 
a relative error of less than 4%, rapidly decreasing to zero as both shape parameters 
increase. 
1 Introduction 

Consider the beta distribution Beta(a, b), with the density function 

$$f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, x^{a-1} (1-x)^{b-1}, \qquad 0 < x < 1.$$

The mean of Beta(a, b) is readily obtained by the formula a/(a + b), but there is no 
general closed formula for the median. The median function, here denoted by m(a, b), 
is the function that satisfies 

$$\frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)} \int_0^{m(a,b)} x^{a-1} (1-x)^{b-1}\, dx = \frac{1}{2}.$$

The relationship m(a, b) = 1 - m(b, a) holds. Only for the special cases a = 1 or 
b = 1 may we obtain an exact formula: m(a, 1) = 2^{-1/a} and m(1, b) = 1 - 2^{-1/b}. 
Moreover, when a = b, the median is exactly 1/2. 
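As a quick sanity check of the special cases above, the following sketch evaluates the exact medians and confirms that the distribution function equals 1/2 there. It relies on the standard facts that Beta(a, 1) has distribution function x^a and Beta(1, b) has distribution function 1 - (1 - x)^b; the function names are illustrative only.

```python
def median_b1(a):
    """Exact median of Beta(a, 1): 2**(-1/a)."""
    return 2.0 ** (-1.0 / a)

def median_a1(b):
    """Exact median of Beta(1, b): 1 - 2**(-1/b)."""
    return 1.0 - 2.0 ** (-1.0 / b)

def cdf_b1(x, a):
    """Distribution function of Beta(a, 1)."""
    return x ** a

def cdf_a1(x, b):
    """Distribution function of Beta(1, b)."""
    return 1.0 - (1.0 - x) ** b

for shape in (1.0, 2.0, 5.0):
    m1 = median_b1(shape)
    m2 = median_a1(shape)
    # Both distribution functions should return 1/2 at the median,
    # and the two medians should be mirror images: m(a, 1) = 1 - m(1, a).
    print(shape, cdf_b1(m1, shape), cdf_a1(m2, shape), m1 + m2)
```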

There has been much literature about the incomplete beta function and its inverse 
(see e.g. Dutka (1981) for a review). The focus in the literature has been on finding 
accurate numerical results, but a simple and practical approximation that is easy to 
compute has not been found. 

2 A new closed-form approximation for the median 

Figure 1: Relative errors of the approximation (a - 1/3)/(a + b - 2/3) of the median of 
the Beta(a, b) distribution, compared with the numerically computed value for several 
fixed p = a/(a + b) < 1/2. The horizontal axis shows the shape parameter a on 
logarithmic scale. From left to right, p = 0.499, 0.49, 0.45, 0.35, 0.25, and 0.001. 

Trivial bounds for the median can be derived (Payton et al., 1989), which are a 
consequence of the more general mode-median-mean inequality (Groeneveld and Meeden, 
1977). In the case of the beta distribution with 1 < a < b, the median is bounded by 
the mode (a - 1)/(a + b - 2) and the mean a/(a + b): 

$$\frac{a-1}{a+b-2} \le m(a,b) \le \frac{a}{a+b}.$$

For a < 1 the formula for the mode does not hold as there is no mode. If 1 < b < a, 
the order of the inequality is reversed. Equality holds if and only if a = b; in this case 
the mean, median, and mode are all equal to 1/2. 

This inequality shows that if the mean is kept fixed at some p, and one of the shape 
parameters is increased, say a, then the median is sandwiched between 
p(a - 1)/(a - 2p) and p, hence the median tends to p. 
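The lower bound quoted above follows by substituting b = a(1 - p)/p into the formula for the mode. A short worked version, assuming p <= 1/2 so that the mode is the lower bound, also shows that the gap between the two bounds shrinks like 1/a:

```latex
\begin{align*}
b &= \frac{a(1-p)}{p}, \qquad a + b = \frac{a}{p}, \\
\frac{a-1}{a+b-2} &= \frac{a-1}{a/p - 2} = \frac{p(a-1)}{a-2p}, \\
p - \frac{p(a-1)}{a-2p} &= \frac{p(1-2p)}{a-2p} \longrightarrow 0 \quad (a \to \infty).
\end{align*}
```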

From the formulas for the mode and mean, it can be conjectured that the median 
m(a, b) could be approximated by m(a, b; d) = (a - d)/(a + b - 2d) for some 
d ∈ (0, 1), as this form would satisfy the above inequality while agreeing with the 
symmetry requirement, that is, m(a, b; d) = 1 - m(b, a; d). 
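The two properties claimed for the family m(a, b; d) can be checked directly. The sketch below is only an illustration with hypothetical function names: it verifies the symmetry relation and that, for 1 < a < b and d between 0 and 1, the value lies between the mode (d = 1) and the mean (d = 0).

```python
def approx_median(a, b, d=1.0 / 3.0):
    """Candidate approximation m(a, b; d) = (a - d) / (a + b - 2d)."""
    return (a - d) / (a + b - 2.0 * d)

def check_properties(a, b, d):
    """Return (symmetry_ok, bracket_ok) for shape parameters 1 < a < b."""
    m = approx_median(a, b, d)
    symmetry_ok = abs(m - (1.0 - approx_median(b, a, d))) < 1e-12
    mode = approx_median(a, b, 1.0)   # d = 1 reproduces the mode
    mean = approx_median(a, b, 0.0)   # d = 0 reproduces the mean
    bracket_ok = mode <= m <= mean
    return symmetry_ok, bracket_ok

for d in (0.1, 0.3, 1.0 / 3.0, 0.5, 0.9):
    print(d, approx_median(2.0, 5.0, d), check_properties(2.0, 5.0, d))
```

For a < b the value decreases monotonically in d, from the mean at d = 0 to the mode at d = 1, which is why any d in (0, 1) respects the mode-median-mean bounds.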
Figure 2: Relative errors of the approximation (a - 1/3)/(a + b - 2/3) of the median 
of the Beta(a, b) distribution over the whole range of possible distribution means p = 
a/(a + b). The smaller of the shape parameters is fixed, i.e. for p < 0.5, the median 
is computed for Beta(a, a(1 - p)/p) and for p > 0.5, the median is computed for 
Beta(bp/(1 - p), b). 



Since a Beta(a, b) variate can be expressed as the ratio γ1/(γ1 + γ2) where γ1 ~ 
Gamma(a) and γ2 ~ Gamma(b) (both with unit scale), it is useful to have a look at 
the median of the gamma distribution. Berg and Pedersen (2006) studied the median 
function of the unit-scale gamma distribution, denoted here by M(a), 
for any shape parameter a > 0, and obtained M(a) = a - 1/3 + o(1), rapidly 
approaching a - 1/3 as a increases. It can therefore be conjectured that the median 
of the beta distribution may be approximated by 

$$m(a,b) \approx m(a,b;1/3) = \frac{a - 1/3}{(a - 1/3) + (b - 1/3)} = \frac{a - 1/3}{a + b - 2/3}.$$
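The gamma-median result that motivates d = 1/3 can be illustrated numerically for integer shape parameters, where the unit-scale gamma (Erlang) distribution function has a finite closed form. The sketch below is an assumption-laden illustration rather than the method of Berg and Pedersen: it finds M(k) by bisection and prints the gap M(k) - (k - 1/3).

```python
import math

def erlang_cdf(x, k):
    """Distribution function of Gamma(k), unit scale, integer shape k >= 1."""
    term, total = 1.0, 1.0            # term holds x**j / j!
    for j in range(1, k):
        term *= x / j
        total += term
    return 1.0 - math.exp(-x) * total

def gamma_median(k, tol=1e-12):
    """Median of Gamma(k) by bisection on [0, k + 1]."""
    lo, hi = 0.0, k + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, k) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gap(k):
    """Difference between the median and the approximation k - 1/3."""
    return gamma_median(k) - (k - 1.0 / 3.0)

for k in (1, 2, 5, 20):
    print(k, gamma_median(k), gap(k))
```

For k = 1 the distribution is exponential, so the computed median should agree with log 2.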

Figure 1 shows that this approximation indeed appears to approach the numerically 
computed median asymptotically for all distribution means p = a/(a + b) as the 
(smaller) shape parameter a → ∞. For a > 1, the relative error is less than 4%, and 
for a > 2 this is already less than 1%. 
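The relative errors described above can be reproduced approximately with a small numerical experiment. The sketch below is not the computation used for the figures: it assumes moderate shape parameters a, b >= 1, integrates the density with composite Simpson's rule, and locates the median by bisection. All function names are illustrative.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b); intended for moderate a, b >= 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)

def beta_cdf(x, a, b, n=2000):
    """Composite Simpson approximation of Pr(theta < x); n must be even."""
    if x <= 0.0:
        return 0.0
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * beta_pdf(i * h, a, b)
    return total * h / 3.0

def beta_median(a, b, tol=1e-10):
    """Numerical median by bisection on the distribution function."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_error(a, b):
    """Relative error of (a - 1/3)/(a + b - 2/3) against the numerical median."""
    exact = beta_median(a, b)
    approx = (a - 1.0 / 3.0) / (a + b - 2.0 / 3.0)
    return (approx - exact) / exact

# Fixed mean p = 0.25 (so b = 3a), increasing shape parameter a.
for a in (2.0, 4.0, 16.0):
    print(a, 3.0 * a, relative_error(a, 3.0 * a))
```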
Figure 3: Logarithm of the scaled absolute error (distance) log(|m(a, b; d) - 
m(a, b)|/p), computed for a fixed distribution mean p = 0.01 and various d. The 
approximate median of the Beta(a, b) distribution is defined as m(a, b; d) = 
(a - d)/(a + b - 2d). Due to scaling of the error, the graph and its scale will not 
essentially change even if the error is computed for other values of p < 0.5. The 
approximation m(a, b; 1/3) performs the most consistently, attaining the lowest 
absolute error eventually as the precision of the distribution increases. 



Figure 2 shows the relative error over all possible distribution means p = a/(a + 
b), as the smallest of the two shape parameters varies from 1 to 4. This illustrates how 
the relative error tends uniformly to zero over all p as the shape parameters increase. 
The figure also shows that the formula consistently either underestimates or 
overestimates the median depending on whether p < 0.5 or p > 0.5. 

However, the function m(a, b; d) also approximates the median fairly accurately if some 
other d close to 1/3 (say d = 0.3) is chosen. Figure 3 displays curves of the 
logarithm of the absolute difference from the numerically computed median for a fixed 
p = 0.01, as the shape parameter a increases. The absolute difference has been scaled 
by p before taking the logarithm: due to this scaling, the error stays approximately 
constant as p decreases, so the picture and its scale will not essentially change even if 
the error is computed for other values of p < 0.5. The figure shows that although some 
approximations such as d = 0.3 have a lower absolute error for some a, the error of 
m(a, b; 1/3) tends to be lower in the long run, and moreover it performs more 
consistently by decreasing at the same rate on the logarithmic scale. In practical 
applications, d = 0.333 should be a sufficiently good approximation of d = 1/3. 

Figure 4: Tail probabilities Pr(θ < m) of the Beta(a, b) distribution when m = 
(a - 1/3)/(a + b - 2/3). As the smaller of the two shape parameters increases, the tail 
probability tends rapidly and uniformly to 0.5. 
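The comparison across different values of d can be sketched along the same lines as the scaled error log(|m(a, b; d) - m(a, b)|/p). The example below is only a rough illustration: it uses p = 0.25 rather than the p = 0.01 of the figure (an assumption made to keep the simple Simpson integration well behaved), and the numerical median routine is a hypothetical helper, not the one used in the paper.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b); intended for moderate a, b >= 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)

def beta_cdf(x, a, b, n=2000):
    """Composite Simpson approximation of Pr(theta < x); n must be even."""
    if x <= 0.0:
        return 0.0
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * beta_pdf(i * h, a, b)
    return total * h / 3.0

def beta_median(a, b, tol=1e-10):
    """Numerical median by bisection on the distribution function."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def approx_median(a, b, d):
    """m(a, b; d) = (a - d) / (a + b - 2d)."""
    return (a - d) / (a + b - 2.0 * d)

p = 0.25                                   # fixed distribution mean (assumed)
for a in (2.0, 8.0, 32.0):
    b = a * (1.0 - p) / p
    exact = beta_median(a, b)              # computed once per a
    for d in (0.0, 0.3, 1.0 / 3.0, 1.0):
        scaled = abs(approx_median(a, b, d) - exact) / p
        print(a, round(d, 3), math.log10(scaled) if scaled > 0.0 else None)
```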

Another measure of the accuracy is the tail probability Pr(θ < m(a, b; 1/3)) of a 
Beta(a, b) variate θ: good approximators of the median should yield probabilities close 
to 1/2. Figure 4 shows that as long as the smallest of the shape parameters is at least 
1, the tail probability is bounded between 0.4865 and 0.5135. As the shape parameters 
increase, the probability tends rapidly and uniformly to 0.5. 
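The tail-probability criterion is easy to try out, since it needs only the distribution function evaluated at the approximate median. The following sketch assumes shape parameters a, b >= 1 and uses a simple Simpson-rule integration of the density; the helper names are illustrative.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b); intended for moderate a, b >= 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)

def beta_cdf(x, a, b, n=2000):
    """Composite Simpson approximation of Pr(theta < x); n must be even."""
    if x <= 0.0:
        return 0.0
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * beta_pdf(i * h, a, b)
    return total * h / 3.0

def tail_probability(a, b):
    """Pr(theta < m(a, b; 1/3)) for theta ~ Beta(a, b)."""
    m = (a - 1.0 / 3.0) / (a + b - 2.0 / 3.0)
    return beta_cdf(m, a, b)

for a, b in ((1.0, 5.0), (5.0, 1.0), (2.0, 3.0), (4.0, 12.0), (3.0, 3.0)):
    print(a, b, tail_probability(a, b))
```

For a = 1 the value can be checked by hand: with b = 5 the approximate median is 0.125 and the tail probability is 1 - 0.875^5, roughly 0.4871, which lies inside the stated band.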

Finally, let us have a look at a well-known paper that provides further support 
for the uniqueness of m(a, b; 1/3). Peizer and Pratt (1968) and Pratt (1968) provide 
approximations for the probability function Pr(θ < x) of a Beta(a, b) variate θ. 
Although they do not provide a formula for the inverse, it is possible to evaluate the 
probability function at the approximate median. According to Peizer and Pratt (1968), 
Pr(θ < x) is well approximated by Φ(z(a, b; x)) where Φ is the standard normal 
probability function, and z is a function of the shape parameters and the quantile x. 
Consider m = m(a, b; d): z(a, b; m) should be close to zero and at least tend to zero 
fast as a and b increase. Now assume that p is fixed, a varies and b = a(1 - p)/p. The 
function z(a, b; m) equals, rewritten with the notation in this paper, 

$$z(a,b;m) = \frac{1-2m}{(\,\cdots)^{1/2}} \left( \frac{1}{3} - d - \frac{0.02\,p}{a} \left[ \frac{1}{2} + \frac{1 - dp/a}{p(1-p)} \right] \right) \left( \frac{1 + f(a,p;d)}{m(1-m)} \right)^{1/2} \qquad (2)$$



where the function f(a, p; d) tends to zero as a increases, being exactly zero only when 
d = 1/2 or m = 1/2. It is evident that for the fastest convergence rate to zero, one 
should choose d = 1/3. The rate is then of the order O(a^{-3/2}); if d ≠ 1/3, for 
example if we choose the mean p as the approximation of the median (d = 0), the rate 
is at most O(a^{-1/2}). 